Asymptotic Theory and Econometric Practice
Roger  Koenker 


The  classical  paradigm  of  asymptotic  theory  rests  on  the  following  "willing  suspension  of 
disbelief."  We  are  asked  to  imagine  a  colleague  with  an  extremely  diligent  research  assistant  in 
the  throes  of  specifying  an  econometric  model.  Daily,  the  RA  arrives  with  buckets  full  of 
independent  new  observations,  but  our  colleague  is  so  uninspired  by  curiosity  and  convinced 
of  the  validity  of  his  original  model,  that  each  day  he  simply  reestimates  this  original  model 
without  alteration  using  larger  and  larger  samples. 

We estimate a Poisson model of the specification of wage equations in the econometric
literature based on a sample of 733 equations from 156 papers. The results strongly suggest
that the classical paradigm is seriously flawed: the number of parameters estimated in wage
equations, say $p$, tends to infinity as the sample size $n$ tends to infinity, and, roughly,
$p^4/n$ tends to a constant. Should we consign our cherished beliefs in consistency and
asymptotic normality to the dustbin of irrelevance?

On the contrary, the forthright admission that in realistic econometric settings $p \to \infty$
with $n$ offers an opportunity for an even more challenging (and informative) asymptotic
theory. Huber (1973) was apparently the first to observe that, under rather mild regularity
conditions on the sequence of designs, consistency and asymptotic normality of the least squares
estimator in linear models is possible if $p/n \to 0$. Portnoy (1984, 1985), extending results of
Huber and others, has shown that similar results may be established for a broad class of
M-estimators for linear models when $p \log(p)/n \to 0$. We survey these results and report on
some similar results for L-estimators for the linear model of the type proposed in Koenker and
Bassett (1978).


1.    Introduction 

The classical paradigm of asymptotic theory in econometrics rests on the following "willing
suspension of disbelief." We must imagine a colleague in the throes of specifying an
econometric model. Daily, an extremely diligent research assistant arrives with hundreds of
(independent) new observations, but our imaginary colleague is so uninspired by curiosity and
convinced of the validity of his original model that each day he simply reestimates his primal
model, without alteration, employing his ever-larger samples.

Is this a plausible meta-model of econometric model building? Casual observation suggests
that it is not. The parametric dimension of econometric models seems to expand inexorably
as larger samples tempt the researcher to ask new questions and refine old ones. Indeed,
this natural temptation is formally justified by the extensive literature on pre-testing and
model selection. As larger samples improve the precision of our estimates, our willingness to
accept bias in exchange for further improvements in precision inevitably declines.

In  the  next  section  we  propose  a  simple,  yet  we  hope  plausible,  meta-model  of  the 
econometric  model  specification  process.  And  we  present  some  empirical  evidence  on  the 
specification  of  models  of  wage  determination.  We  conclude  from  this  exercise  that  the 
parametric  dimension  of  wage  models  grows  roughly  like  the  fourth  root  of  the  sample  size. 
The hypothesis of classical asymptotic theory that parametric dimension is fixed, i.e.,
independent of sample size, is decisively rejected.

Should we abandon our cherished beliefs in the consistency and asymptotic normality of
econometric methods? Are the approximations suggested by fixed-$p$ asymptotic theory
"irrelevant" to the "real world" of econometric practice? In Section 3 we argue, on the
contrary, that the forthright admission that $p \to \infty$ with $n$ offers an opportunity for a
challenging and much more informative new form of asymptotic theory. We begin by reviewing
results of Huber (1973) on the large sample theory of the least squares estimator in linear
models with $p \to \infty$. The recent results of Yohai and Maronna (1979) and Portnoy (1984) on
large-$p$ asymptotics for other M-estimators are then surveyed. And we conclude with some
remarks about extending these results to the L-estimators for linear models introduced in
Koenker and Bassett (1978).

2.  Econometric  Practice:  A  Meta-Model  of  Wage  Determination  Models 

Models of wage determination offer an unusually rich and revealing source of data on
the practice of model specification in econometrics. The "wage equation" pervades the
applied econometrics literature: models of discrimination in employment, of the effects of
unions, of returns to education, and of wage determination. The development of several large
scale panel surveys of labor market experience has facilitated the rapid growth of this
empirical literature.

A meta-model is, of course, a model of models. As suggested in the previous section, we
are primarily interested in modeling the dependence of the parametric dimension of models,
say $p$, on the sample size of the available data, say $n$. Since the proposed dependent variable,
$p$, is inherently a positive integer, it is natural to begin with Poisson models in which the
intensity (or rate) is taken to be some parametric function of the sample size and perhaps other
characteristics of the research.

The data which we will analyze consist of 733 wage equations reported in 156 papers in
mainstream economics journals and essay collections over the period 1970 to 1980. For each
equation we observe the number of parameters estimated, the sample size, date of publication,
and subject classified into four categories. We also record the number of equations reported
in each paper, which is used to weight the observations. Inevitably, there are ambiguities in
the interpretation of the data. What constitutes an equation? Usually, this is quite
straightforward; however, occasionally one sees samples split by age, race, sex, etc., and
estimated with and without homogeneity constraints on the coefficients. Our policy in these
cases was to interpret the disaggregated form of the equation as a single equation with, say,
$mp$ parameters, not as $m$ distinct equations with $p$ parameters. Frequently, there are
non-wage equations in the surveyed papers; these are remorselessly ignored. Equations must have
wage, or some function of wage, as the dependent variable. Throughout, we have weighted
observations on equations by the reciprocal of the number of equations appearing in the
published paper. This tends to alleviate the problem of over-representation in the sample by a
few (candid) "fishing" enthusiasts who report a large number of equations in a single paper.

It would be barbaric in the extreme to adopt a notation in which $p$ was regressed on $n$,
so we will revert to the more civilized convention of denoting our observed dependent
variable by $y$; the sample size variable will be denoted $z$, and the vector of explanatory
variables will be denoted $x$. Our meta sample size, 733, may thus be denoted simply as $n$, and
the dimension of $x$ by $p$. This notational recursion makes the world safe for
meta-meta-econometrics.

For the Poisson model we may write, for a typical observation,

$$P(Y = y) = e^{-\lambda}\lambda^{y}/y!$$

where the rate parameter $\lambda$ is expressed, e.g., as

$$\lambda = \exp(x\beta) = \exp(\beta_1 + \beta_2 \log z).$$

In this form, the expectation and variance of the random variable $Y$ are both equal to the
value $\lambda$. This is not entirely implausible since we might expect that the dispersion of
model size would increase with its expectation. The Poisson hypothesis is obviously much
stronger than this vague presumption of monotonicity and may be subjected to explicit test.
This problem is addressed below.
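As a rough illustration of how a meta-model of this form could be fitted, the sketch below applies Newton-Raphson to the Poisson log-likelihood with rate $\exp(\beta_1 + \beta_2 \log z)$. The data are simulated, not the 733-equation meta-sample; the function names, the seed, and the "true" coefficients (1.3, 0.25) are assumptions made only for the example.

```python
import math
import random

def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate by Knuth's method (fine for moderate lam)."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def fit_poisson_loglinear(y, logz, iters=25):
    """Poisson MLE of (b1, b2) in log(lambda) = b1 + b2 * log z, by Newton-Raphson."""
    b1, b2 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iters):
        lam = [math.exp(b1 + b2 * t) for t in logz]
        # Score vector (gradient of the log-likelihood).
        g1 = sum(yi - li for yi, li in zip(y, lam))
        g2 = sum((yi - li) * t for yi, li, t in zip(y, lam, logz))
        # Information matrix (negative Hessian), 2 x 2.
        a11 = sum(lam)
        a12 = sum(li * t for li, t in zip(lam, logz))
        a22 = sum(li * t * t for li, t in zip(lam, logz))
        det = a11 * a22 - a12 * a12
        b1 += (a22 * g1 - a12 * g2) / det
        b2 += (a11 * g2 - a12 * g1) / det
    return b1, b2

# Hypothetical meta-sample: sample sizes log-uniform on [50, 500000].
rng = random.Random(0)
logz = [rng.uniform(math.log(50), math.log(500000)) for _ in range(2000)]
y = [poisson_draw(rng, math.exp(1.3 + 0.25 * t)) for t in logz]
b1_hat, b2_hat = fit_poisson_loglinear(y, logz)
print(round(b1_hat, 3), round(b2_hat, 3))
```

The slope estimate plays the role of the elasticity of model size with respect to sample size; the sketch ignores the weighting by equations per paper described above.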

The simplest, and therefore perhaps the most compelling, of our estimated meta-models
yields¹


1 All estimation of Poisson models reported in this paper was carried out in the GLIM
(Generalized Linear Interactive Modeling) System, Release 3; see Baker and Nelder (1978) and
also McCullagh and Nelder (1983). Reported standard errors beneath the coefficients in all
Poisson models are based on the GLIM quasi-likelihood model in which $V(Y) = \sigma^2 E(Y)$ with
$\sigma^2$ a free parameter. It should be emphasized that in cases of overdispersion
($\sigma^2 > 1$) strict adherence to the Poisson assumption that $\sigma^2 = 1$ can seriously bias
standard errors toward zero.


$$\log\lambda = \underset{(0.149)}{1.336} + \underset{(.017)}{0.235}\,\log z \qquad (2.1)$$

Thus, roughly speaking, a 1% increase in the sample size of a wage determination model
induces a 1/4% increase in the number of parameters of the model. This parsimony elasticity,
or for the sake of brevity, "parsity," is, perhaps, the critical parameter of meta-econometrics.
It will be denoted as $\pi$ below. To put it slightly differently, $p/n^{1/4}$ is roughly constant
($= \exp(1.336) \approx 4$) over the range of observed wage equation models. It should be
emphasized that the hypothesis of classical asymptotic theory that the dimension of parametric
models is independent of sample size, $\beta_2 = 0$ in (2.1), is decisively rejected by the data.
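The arithmetic behind this reading of (2.1) can be checked directly. The short sketch below plugs the reported coefficients into the fitted rate; the function name `predicted_p` and the grid of sample sizes are illustrative choices, not part of the original analysis.

```python
import math

B1, B2 = 1.336, 0.235  # coefficients as reported in (2.1)

def predicted_p(n):
    """Expected number of parameters at sample size n under (2.1)."""
    return math.exp(B1 + B2 * math.log(n))

for n in (100, 1000, 10000, 100000):
    p = predicted_p(n)
    # Percentage change in predicted p when the sample grows by 1%.
    pct = 100 * (predicted_p(1.01 * n) / p - 1)
    print(n, round(p, 1), round(p / n ** 0.25, 2), round(pct, 3))
```

The third printed column is $p/n^{1/4}$, which drifts only slowly with $n$ because the estimated elasticity is slightly below 1/4.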

Our  simple  bivariate  model  is  unsatisfactory  in  several  respects: 

1.) It predicts poorly for small $n$, implying extravagantly prodigal models for $n < 100$,
and negative degrees of freedom for $n < 10$.

2.) The model, in GLIM terminology, is seriously overdispersed, i.e., the Poisson
hypothesis that $V(Y) = E(Y)$ is not supported by the data. The usual GLIM diagnostic,
the estimated scale parameter

$$\hat\sigma^2 = (n-p)^{-1} \sum_i (y_i - \hat\lambda_i)^2 / \hat\lambda_i,$$

is 4.73 in this case and significantly different from the hypothesized value of one.

3.) There are a few highly influential observations with $z_i$'s (sample sizes) above
500,000.
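The scale diagnostic in point (2.) is a Pearson chi-square statistic divided by its degrees of freedom. A minimal sketch of the computation follows; the function name `pearson_scale` and the toy numbers are hypothetical and serve only to show the formula at work.

```python
def pearson_scale(y, lam_hat, n_params):
    """Pearson statistic sum((y - lam)^2 / lam) divided by (n - p)."""
    n = len(y)
    chi2 = sum((yi - li) ** 2 / li for yi, li in zip(y, lam_hat))
    return chi2 / (n - n_params)

# Toy illustration (made-up counts and fitted rates, one fitted parameter).
toy_y = [2, 4, 9]
toy_lam = [3.0, 4.0, 8.0]
print(pearson_scale(toy_y, toy_lam, n_params=1))
```

Values well above one, such as the 4.73 reported above, indicate more dispersion than the Poisson hypothesis allows.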

Thus the narrow confidence interval on the coefficient of $\log z$ in (2.1), constructed
conditional on this specification of the meta-model, is far too optimistic. We have experimented
with several alternate functional forms of the model for the conditional expectation of model
size. The obvious tactic of introducing a log quadratic term is (unfortunately) extremely
sensitive to the observations alluded to in point (3.) above. With those observations, we obtain

$$\log\lambda = \underset{(.512)}{-.438} + \underset{(.118)}{.663}\,\log z - \underset{(.0067)}{.0245}\,(\log z)^2 \qquad (2.2)$$

while without them we have

$$\log\lambda = \underset{(.512)}{1.737} + \underset{(.128)}{.0581}\,\log z + \underset{(.0078)}{.01543}\,(\log z)^2 \qquad (2.3)$$

In the former the model predicts that model size declines after roughly $n$ = 100,000, whereas
the latter implies smoothly increasing parsity. In both cases the parsity at mean² sample size
($n \approx 1000$) is roughly comparable to our simple model, $\pi = .32$ for (2.2) and $\pi = .27$
for (2.3). It is admittedly disturbing to find that the rise and fall of parsity is so sensitive
to a few large-sample observations from our meta-sample. However, such sensitivity, especially
in quadratic models, is often inevitable. Further, one may wish to question whether the
observations with $n > 250,000$ are really drawn from the same population as the other
observations of our meta-sample. For these cases, computational considerations enter the model
specification process in a nontrivial way and may eventually come to dominate the "scientific"
considerations which we emphasized in Section 1.³ Thus we believe that there should be some
a priori preference for (2.3) over (2.2).

We have also experimented with models in $\log(\log n)$. The estimated model

$$\log\lambda = \underset{(.315)}{-.777} + \underset{(.148)}{1.947}\,\log\log z \qquad (2.4)$$

yields a slightly better fit than our simple meta-model (2.1) and at mean sample size it implies
a parsity of $\pi = .28$. The log-log form has the attractive feature that the parsity parameter
is proportional to the reciprocal of log (sample size), and therefore tends to zero as
$n \to \infty$. Figure 2.1 illustrates the differences among the four models reported above with
respect to parsity as a function of sample size. One sees clearly in the figure that the
differences between the functional forms are primarily in the extremes of the observed sample
sizes.
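The parsity implied by each functional form is the derivative of $\log\lambda$ with respect to $\log z$. The sketch below evaluates it for the four reported models using the published coefficients; the function names and the grid of sample sizes are illustrative assumptions.

```python
import math

def parsity_linear(n):
    """(2.1): constant elasticity."""
    return 0.235

def parsity_quad_all(n):
    """(2.2): quadratic in log z, all observations."""
    return 0.663 - 2 * 0.0245 * math.log(n)

def parsity_quad_trimmed(n):
    """(2.3): quadratic in log z, largest samples excluded."""
    return 0.0581 + 2 * 0.01543 * math.log(n)

def parsity_loglog(n):
    """(2.4): linear in log log z, so parsity is 1.947 / log z."""
    return 1.947 / math.log(n)

models = (parsity_linear, parsity_quad_all, parsity_quad_trimmed, parsity_loglog)
for n in (50, 1041, 50000, 500000):
    print(n, [round(f(n), 3) for f in models])
```

At the geometric mean sample size the four forms agree closely, and they separate mainly at the extremes of the grid.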

We have emphasized above that all of the Poisson models suffer from over-dispersion;
that is, the estimated variance of the dependent variable is roughly 3-4 times the mean
predicted by the Poisson model. One interpretation of this overdispersion in Poisson models
is that there is some inherent variability in the rate parameter $\lambda$ around its
hypothesized (log)

2 Since sample sizes are logged, this mean ($n = 1041$) is geometric.

3 This comment may seem to undercut our contention that $p \to \infty$ with $n$, which if taken
absolutely literally is evidently asymptotically computationally infeasible. Of course, what is
relevant is what happens in the range of practical experience for which conventional
asymptotic theory is expected to provide a guide; in the case of wage equations this seems to be
roughly sample sizes in the range 50-500,000. Here the evidence seems overwhelming that $p$
increases gradually with $n$.


linear form. The classical approach to treating this (common) syndrome is to hypothesize a
gamma distribution for the intercept of the rate equation, and on integrating out this random
parameter one obtains a negative binomial model for the dependent variable. See, e.g., Johnson
and Kotz (1969) and the references cited there. This approach may be traced to
Anscombe (1949), who applied it in entomology; a recent application in econometrics is Hausman
and Griliches (1983), and an extremely insightful view of this problem and parametric
heterogeneity in general is provided by Cox (1984). This interpretation is also set forth in
Chesher (1984).
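The gamma-mixing interpretation can be illustrated by simulation. The sketch below multiplies a fixed rate by a mean-one gamma variate before drawing a Poisson count, so the counts are negative binomial with variance $\mu + \mu^2/k$ for gamma shape $k$. The values $\mu = 5$, $k = 2$, the seed, and the function names are arbitrary assumptions for the example.

```python
import math
import random

def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate by Knuth's method (fine for moderate lam)."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def mixed_poisson_sample(rng, mu, shape, size):
    """Poisson counts whose rate is mu times a gamma(shape, 1/shape) multiplier."""
    out = []
    for _ in range(size):
        lam = mu * rng.gammavariate(shape, 1.0 / shape)  # multiplier has mean one
        out.append(poisson_draw(rng, lam))
    return out

rng = random.Random(1)
mu, shape = 5.0, 2.0
draws = mixed_poisson_sample(rng, mu, shape, 20000)
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / (len(draws) - 1)
# Theory: mean = mu, variance = mu + mu^2 / shape (here 17.5, not 5).
print(round(mean, 2), round(var, 2), mu + mu * mu / shape)
```

A pure Poisson sample with the same mean would show a variance close to 5, so the mixing alone is enough to produce a variance several times the mean.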

Tests for parametric heterogeneity in Poisson models may be developed along the lines
suggested by Lancaster (1984) based on Chesher (1984), White (1982), Cox (1984) and others.
The basic information identity

$$0 = E\,\frac{\partial \log f}{\partial\theta}\frac{\partial \log f}{\partial\theta'} + E\,\frac{\partial^2 \log f}{\partial\theta\,\partial\theta'}$$

and its extensions may be used to construct tests which are readily computed as $nR^2$ from a
regression of a column of ones on a matrix of $n$ by $p(p+1)/2$ elements of $D$, the
observation-level counterparts of the right-hand side of the identity, augmented by the
matrix of gradient "observations" $g = \partial \log f/\partial\theta$ evaluated at the mle.
"Explanatory power" in this regression suggests systematic departures in the fitted model from
the hypothesis that $D$ and $g$ have zero expectation. We have conducted a number of these tests
restricting attention to the components of $[D\; g]$ corresponding to the intercept parameter in
the $\log\lambda$ equation. Here the test is particularly simple since

$$D_i = (y_i - \hat\lambda_i)^2 - \hat\lambda_i$$

and

$$g_i = y_i - \hat\lambda_i$$

where $\hat\lambda_i = e^{x_i\hat\beta}$. The test statistic is 133.1 for meta-model (2.1), for
example, which is clearly an implausible value for a $\chi^2$ variable on 2 degrees of freedom.
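For the intercept component, the $nR^2$ statistic reduces to a quadratic form in the column sums of $D_i$ and $g_i$. The sketch below computes it from counts and fitted rates, taking $R^2$ to be the uncentered $R^2$ of the regression of ones on the two columns (no constant). The function name and the toy data are hypothetical.

```python
def im_test_intercept(y, lam_hat):
    """n * R^2 from regressing a column of ones on [D_i, g_i] without a constant."""
    D = [(yi - li) ** 2 - li for yi, li in zip(y, lam_hat)]
    g = [yi - li for yi, li in zip(y, lam_hat)]
    sD, sg = sum(D), sum(g)                      # Z'1
    a = sum(d * d for d in D)                    # elements of Z'Z
    b = sum(d * gi for d, gi in zip(D, g))
    c = sum(gi * gi for gi in g)
    det = a * c - b * b
    # 1'Z (Z'Z)^{-1} Z'1, which equals n times the uncentered R^2.
    return (c * sD * sD - 2 * b * sD * sg + a * sg * sg) / det

# Toy illustration with made-up counts and a constant fitted rate.
stat = im_test_intercept([2, 5, 8, 5], [5.0, 5.0, 5.0, 5.0])
print(stat)
```

Large values relative to a $\chi^2$ distribution on 2 degrees of freedom, as in the 133.1 reported above, point to neglected heterogeneity in the rate.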


Unfortunately, the negative binomial model, while quite attractive from a number of
perspectives, is quite unwieldy computationally. Some initial forays have been made using the
remarkable quasi-mle software of Spady (1984). This approach is capital intensive, but avoids
the difficulties of coding analytical derivatives, and has the singular virtue of producing
numerically reliable standard errors.⁴ In the simple log linear model, we obtain

$$\log\lambda = \underset{(.22)}{1.039} + \underset{(.033)}{.272}\,\log z$$

with $\beta = 1.51$ (.13). The parsity parameter in this model is independent of $z$ and at .272
roughly the same as in the simple Poisson model. Negative binomial models using other
specifications of the conditional mean function also produce results closely resembling their
Poisson counterparts.

3.  Asymptotic  Theory:  A  Practical  Paradigm 

We are faced with a great dialectical discrepancy. Theory offers us a static view of the
econometric model, a model "cast in concrete," unperturbed by the influx of new data. The
practice of econometrics, however, offers quite a different, more plastic, view: models
gradually expanding and elaborating themselves in response to the availability of new data. How
are these views to be reconciled?

The answer, of course, is to expand the paradigm of classical asymptotic theory. Huber
(1973) was apparently the first to observe that, under rather mild regularity conditions on the
sequence of designs, consistency and asymptotic normality of the least-squares estimator in
linear models was possible if $p/n \to 0$. These results are quite elementary, on the same level
as the fixed-$p$ asymptotics which are done in introductory graduate courses, and therefore
should be better known. To my knowledge, only the recent text of Amemiya (1985) treats any
of these questions and even there the implications are only implicit.


4 Standard errors are computed by numerical approximations to the general quasi-mle
formula $V = J^{-1} I J^{-1}$ where $I$ denotes $E\,\partial\log f/\partial\theta\;\partial\log f/\partial\theta'$
and $J$ denotes $E\,\partial^2\log f/\partial\theta\,\partial\theta'$.



To illustrate the general approach, consider the simplest application: the classical linear
model with iid disturbances, and the asymptotic behavior of the least-squares estimator. For
fixed $p$, and error distributions with finite variance, we know that $\hat\beta \to \beta_0$
strongly iff $(X'X)^{-1} \to 0$. See Lai, Robbins and Wei (1979) for a proof; this is a
surprisingly delicate and difficult result. For $p \to \infty$ with $n$, consider the "hat"
matrix⁵ $H = X(X'X)^{-1}X'$. We know the following: $h_{ii} \in [0,1]$, $\mathrm{tr}(H) = p$,
$HH = H$. Thus, since $\hat y = Hy$, we have

$$\mathrm{Var}(\hat y_i) = \sum_{k} h_{ik}^2 \sigma^2 = h_{ii}\sigma^2 \qquad (3.1)$$

so by Chebyshev's inequality

$$P[\,|\hat y_i - E\hat y_i| \ge \epsilon\,] \le h_{ii}\sigma^2/\epsilon^2. \qquad (3.2)$$

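Proposition 3.1. (Huber) $\hat y_i$ is weakly consistent, i.e., $\hat y_i \to_p x_i\beta$, iff $h_{ii} \to 0$.
Proof. Sufficiency above. Necessity:

$$\hat y_i - E\hat y_i = h_{ii}u_i + \sum_{k \ne i} h_{ik}u_k. \qquad (3.3)$$

For independent random variables $X$, $Y$,

$$P[|X+Y| \ge \epsilon] \ge P[X \ge \epsilon]P[Y \ge 0] + P[X \le -\epsilon]P[Y < 0] \ge \min\{P[X \ge \epsilon],\, P[X \le -\epsilon]\}$$

so

$$P[|\hat y_i - E\hat y_i| \ge \epsilon] \ge \min\{P[u_i \ge \epsilon/h_{ii}],\; P[u_i \le -\epsilon/h_{ii}]\}$$

and if $h_{ii} \not\to 0$, then the rhs is bounded away from zero ■

Note that $h = \max_i h_{ii} \ge \frac{1}{n}\sum_i h_{ii} = \frac{1}{n}\mathrm{tr}(H) = p/n$, so
$h \to 0 \Rightarrow p/n \to 0$; thus $p/n \to 0$ is necessary, but not sufficient, for weak
consistency.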
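The properties of the hat matrix used above are easy to verify numerically. The sketch below computes the diagonal elements $h_{ii}$ by orthonormalizing the columns of a design with Gram-Schmidt, so that $h_{ii}$ is the squared length of the $i$-th row of the orthonormal basis. The random design, the seed, and the function names are assumptions made for illustration only.

```python
import math
import random

def orthonormalize(cols):
    """Gram-Schmidt: return an orthonormal basis for the span of the columns."""
    basis = []
    for col in cols:
        v = list(col)
        for q in basis:
            dot = sum(vi * qi for vi, qi in zip(v, q))
            v = [vi - dot * qi for vi, qi in zip(v, q)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        basis.append([vi / norm for vi in v])
    return basis

def leverages(cols):
    """Diagonal of the hat matrix H = Q Q' for a full-rank design given by columns."""
    basis = orthonormalize(cols)
    n = len(cols[0])
    return [sum(q[i] ** 2 for q in basis) for i in range(n)]

# Hypothetical design: intercept plus three Gaussian regressors, n = 40, p = 4.
rng = random.Random(2)
n, p = 40, 4
cols = [[1.0] * n] + [[rng.gauss(0, 1) for _ in range(n)] for _ in range(p - 1)]
h = leverages(cols)
print(round(sum(h), 6), round(max(h), 3), p / n)
```

The trace equals $p$, every $h_{ii}$ lies in $[0,1]$, and the largest leverage is never below $p/n$, which is the inequality used in the note above.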


5 This terminology is due to Tukey and may be attributed to the fact that $H$ "puts the hat
on $y$", i.e., $\hat y = Hy$.


Now consider an arbitrary linear function of $\beta$, say $\alpha = a'\beta$, $\|a\| = 1$. Assume
$F$ isn't Gaussian, and reparameterize so that

$$X'X = I_p.$$

Hence,

$$\hat\beta = X'y$$

and

$$\hat\alpha = a'\hat\beta = a'X'y = s'y$$

where

$$s's = a'X'Xa = 1$$

so

$$\mathrm{Var}(\hat\alpha) = \sigma^2.$$

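Proposition 3.2. (Huber) $\hat\alpha$ is asymptotically Gaussian iff $\bar s = \max_i |s_i| \to 0$.

Proof. If $\bar s \not\to 0$ then either $\hat\alpha$ doesn't have a limiting distribution or it is
a convolution of two parts, one of which is $F$, thus not Gaussian, by hypothesis. If
$\bar s = \max_i |s_i| \to 0$, then the Lindeberg condition is

$$\frac{1}{\sigma^2}\sum_i E\, s_i^2 u_i^2 I(|s_i u_i| > \epsilon\sigma) \le \frac{1}{\sigma^2} E\, u^2 I(|u| > \epsilon\sigma/\bar s) \quad (\text{since } s's = 1)$$

$$\to 0 \quad \text{since } \bar s \to 0 \; ■$$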

Proposition 3.3 (Bickel) Estimable functions $\alpha = a'\beta$ are asymptotically Gaussian with
natural parameters iff the fitted values are consistent.

Proof. (Huber) Since $X'X = I$,

$$s_i^2 = \Big(\sum_k x_{ik}a_k\Big)^2 \le \Big(\sum_k x_{ik}^2\Big)\Big(\sum_k a_k^2\Big) = h_{ii},$$

so, with $h = \max_i h_{ii}$ and $\bar s = \max_i |s_i|$ as above,

$$h \to 0 \Rightarrow \bar s \to 0 \Leftrightarrow \hat\alpha \to_d G.$$

This establishes the "if"; the "only if" follows from the decomposition (3.3) and the hypothesis
that $u_i$ is not, itself, Gaussian ■
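The Cauchy-Schwarz step in the proof can be checked on a small example. The sketch below uses a hand-picked $4 \times 2$ design with orthonormal columns (so $X'X = I$) and unit vectors $a = (\cos\theta, \sin\theta)$; the design and the function name are illustrative assumptions.

```python
import math

# Rows x_i of a 4 x 2 design whose columns are orthonormal (X'X = I).
X = [[0.5, 0.5], [0.5, -0.5], [0.5, 0.5], [0.5, -0.5]]

def check_bound(theta):
    """Return (whether s_i^2 <= h_ii for all i, the value of s's) for a unit vector a."""
    a = (math.cos(theta), math.sin(theta))
    s = [row[0] * a[0] + row[1] * a[1] for row in X]   # s = X a
    h = [row[0] ** 2 + row[1] ** 2 for row in X]       # h_ii = sum_k x_ik^2
    holds = all(si * si <= hi + 1e-12 for si, hi in zip(s, h))
    return holds, sum(si * si for si in s)

for theta in (0.0, math.pi / 6, math.pi / 4, 2.0):
    print(round(theta, 3), check_bound(theta))
```

Equality is attained when $a$ is proportional to a row of the design, which is why the bound $s_i^2 \le h_{ii}$ cannot be improved in general.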

These results for the least squares estimator are extremely encouraging. What happens in
nonlinear cases? The simplest nonlinear case is robust regression for linear models. Here all
the nonlinearity seems to be very well circumscribed; already, however, serious difficulties
arise. Huber (1973), on the basis of informal expansions and Monte Carlo, conjectured that
$p^2/n \to 0$ was necessary to achieve a uniform normal approximation for a typical M-estimator
in the absence of any symmetry conditions on the error distribution. Subsequently, Yohai and
Maronna (1979) showed that $p^{3/2}h \to 0$ implied a uniform normal approximation, but this
means, since $h = O(p/n)$, that $p^{5/2}/n \to 0$ would be sufficient. Huber (1981) conjectured that
$ph \to 0$ was sufficient and that $\sqrt{p}\,h \to 0$ was necessary if the error distribution was
permitted to be asymmetric. For symmetric errors one might hope that $h \to 0$ was sufficient as
in the least-squares case. Huber (1981) contains an elementary proof for the case $p^2h \to 0$.

Portnoy (1984, 1985) has recently improved these results and verified an important
conjecture of Huber. In particular, he shows that under reasonably mild conditions on $X$,⁶
$p(\log n)/n \to 0$ suffices for norm consistency of M-estimators based on (smoothly) monotone
$\psi$ functions. Asymptotic normality is more problematic, and under slightly stronger
regularity conditions, Portnoy shows that if $(p\log p)^{3/2}/n \to 0$ then a uniform normal
approximation is possible. Note that this essentially, except for the factor $(\log p)^{3/2}$,
verifies Huber's conjecture. Unfortunately, Portnoy's arguments, which are based on density
expansions, are extremely

6 Conditions which roughly require that $x_i/\|x_i\|$ be smoothly distributed on the unit
sphere in $R^p$, as would be the case if they were iid and had a nice multivariate density.
complex and delicate. The situation is somewhat better for monotone $\psi$, but even there the
sledding is rough.
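To get a feel for what these rate conditions demand, the sketch below tabulates the relevant ratios when the parametric dimension grows like the fourth root of the sample size, $p = c\,n^{1/4}$ with $c = 4$ as suggested by the meta-model of Section 2. The function name `rate_table` and the grid of sample sizes are illustrative assumptions.

```python
import math

def rate_table(sizes, c=4.0):
    """Ratios appearing in the growth conditions when p = c * n**(1/4)."""
    rows = []
    for n in sizes:
        p = c * n ** 0.25
        rows.append({
            "n": n,
            "p": p,
            "p/n": p / n,
            "p log p / n": p * math.log(p) / n,
            "(p log p)^1.5 / n": (p * math.log(p)) ** 1.5 / n,
            "p^2/n": p * p / n,
        })
    return rows

table = rate_table([10 ** k for k in range(2, 7)])
for row in table:
    print({key: round(val, 4) for key, val in row.items()})
```

All four ratios shrink as $n$ grows along this path, although the $(p\log p)^{3/2}/n$ ratio is still above one at $n = 100$.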

Perhaps here we should pause to reconsider the implications for the wage equation
literature considered in the previous section. Recall that our empirical meta-model of wage
equations implied that $p^4/n$ was roughly constant over the observed range of sample sizes.
Thus, the Huber-Portnoy results would appear to be extremely encouraging. However, we
should be careful to remember that they rely on certain regularity conditions on the sequence
of designs in addition to the rate conditions on the growth of $p$. These conditions, as Portnoy
shows, are satisfied by design sequences drawn at random from a distribution "not too
concentrated in any fixed directions." This, in a simpler form, already arose in the case of
least squares, where $h \to 0$ implied $p/n \to 0$ as a necessary condition, but clearly the $h$
condition is much more stringent. For example, in the $p$-sample design it requires that the
number of observations in each cell tends to infinity as $n \to \infty$.
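The $p$-sample example can be made concrete. With one indicator regressor per cell, $X'X$ is diagonal with the cell counts on the diagonal, so an observation in a cell of size $m$ has leverage $1/m$. The sketch below, with hypothetical cell sizes and function names, shows a design in which $p/n$ is tiny while the maximal leverage stays at 1/2.

```python
def p_sample_leverages(cell_sizes):
    """Leverage of each observation in a p-sample layout: 1 / (size of its own cell)."""
    return [1.0 / m for m in cell_sizes for _ in range(m)]

def summary(cell_sizes):
    """Return (p/n, max leverage, trace of H) for the given cell sizes."""
    n, p = sum(cell_sizes), len(cell_sizes)
    h = p_sample_leverages(cell_sizes)
    return p / n, max(h), sum(h)

# Hypothetical design: two large cells and one cell with only two observations.
ratio, h_max, trace = summary([2, 1000, 1000])
print(ratio, h_max, round(trace, 6))
```

Adding observations to the large cells drives $p/n$ toward zero but leaves the fitted value in the small cell with variance $\sigma^2/2$, so it is not consistent.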